-
Molinaro, Nicola (Ed.)
The picture naming task is common both as a clinical task and as a method to study the neural bases of speech production in the healthy brain. However, this task is not reflective of most naturally occurring productions, which tend to happen within a context, typically in dialogue in response to someone else's production. How the brain basis of the classic "confrontation picture naming" task compares to the planning of utterances in dialogue is not known. Here we used magnetoencephalography (MEG) to measure neural activity associated with language production using the classic picture naming task as well as a minimal variant of the task, intended as more interactive or dialogue-like. We assessed how neural activity is affected by the interactive context in children, teenagers, and adults. The general pattern was that in adults, the interactive task elicited a robust sustained increase of activity in frontal and temporal cortices bilaterally, as compared to simple picture naming. This increase was present only in the left hemisphere in teenagers and was absent in children, who, in fact, showed the reverse effect. Thus our findings suggest a robustly bilateral neural basis for the coordination of interaction and a very slow developmental timeline for this network.
-
To enable building and testing models on long-document comprehension, we introduce QuALITY, a multiple-choice QA dataset with context passages in English that have an average length of about 5,000 tokens, much longer than typical current models can process. Unlike in prior work with passages, our questions are written and validated by contributors who have read the entire passage, rather than relying on summaries or excerpts. In addition, only half of the questions are answerable by annotators working under tight time constraints, indicating that skimming and simple search are not enough to consistently perform well. Our baseline models perform poorly on this task (55.4%) and significantly lag behind human performance (93.5%).
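The accuracy figures above are the standard multiple-choice metric: the fraction of questions for which a model selects the gold answer option. A minimal sketch of that scoring loop is below; the record layout and field names (`passage`, `question`, `options`, `gold`) are illustrative stand-ins, not the dataset's actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class MCItem:
    """One multiple-choice comprehension item (illustrative schema)."""
    passage: str
    question: str
    options: list = field(default_factory=list)  # candidate answers
    gold: int = 0                                # index of the correct option

def accuracy(predict, items):
    """Fraction of items where predict(item) returns the gold option index.

    `predict` is any callable mapping an MCItem to an option index,
    e.g. a wrapper around a QA model's argmax over option scores.
    """
    return sum(predict(item) == item.gold for item in items) / len(items)
```

Note that with four answer options, random guessing sits at 25%, which puts the 55.4% baseline above chance but well short of the 93.5% human level.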
-
We introduce The Benchmark of Linguistic Minimal Pairs (BLiMP), a challenge set for evaluating the linguistic knowledge of language models (LMs) on major grammatical phenomena in English. BLiMP consists of 67 individual datasets, each containing 1,000 minimal pairs—that is, pairs of minimally different sentences that contrast in grammatical acceptability and isolate a specific phenomenon in syntax, morphology, or semantics. We generate the data according to linguist-crafted grammar templates, and human aggregate agreement with the labels is 96.4%. We evaluate n-gram, LSTM, and Transformer (GPT-2 and Transformer-XL) LMs by observing whether they assign a higher probability to the acceptable sentence in each minimal pair. We find that state-of-the-art models identify morphological contrasts related to agreement reliably, but they struggle with some subtle semantic and syntactic phenomena, such as negative polarity items and extraction islands.
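The evaluation criterion described above can be sketched in a few lines: a model is credited for a minimal pair exactly when it assigns a higher probability to the acceptable sentence. The toy add-one-smoothed unigram scorer below is only a placeholder so the loop runs end to end; in practice `sentence_logprob` would sum token log-probabilities from a real LM (n-gram, LSTM, or Transformer).

```python
import math

def make_unigram_scorer(corpus):
    """Build a toy add-one-smoothed unigram log-probability scorer.

    Stand-in for a real LM: real BLiMP evaluation would score each
    sentence with, e.g., GPT-2 token log-probabilities instead.
    """
    counts, total = {}, 0
    for sentence in corpus:
        for tok in sentence.lower().split():
            counts[tok] = counts.get(tok, 0) + 1
            total += 1
    vocab = len(counts) + 1  # +1 slot for unseen tokens

    def sentence_logprob(sentence):
        return sum(
            math.log((counts.get(tok, 0) + 1) / (total + vocab))
            for tok in sentence.lower().split()
        )
    return sentence_logprob

def minimal_pair_accuracy(scorer, pairs):
    """BLiMP-style forced choice: credit a (good, bad) pair when the
    scorer assigns strictly higher log-probability to the good sentence."""
    correct = sum(1 for good, bad in pairs if scorer(good) > scorer(bad))
    return correct / len(pairs)
```

Because both sentences in a pair are minimally different, length and content-word confounds largely cancel, so comparing raw sentence probabilities is a fair forced-choice test.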